| Name | Version | Summary | Date |
| --- | --- | --- | --- |
| parsethisio | 0.2.3 | A Python library to extract text from various sources for LLM preprocessing. | 2025-07-24 03:29:15 |
| text-extra | 0.1.4 | A simple tool for text extraction from pdf, epub, txt, and docx files | 2024-03-26 18:45:03 |
| hotpdf | 0.4.6.1 | Fast PDF Data Extraction library | 2024-02-22 15:45:53 |
| pictureTextCrop | 0.6.1 | Interactive extraction of selected text from images and batch processing of stored image files. | 2024-01-04 05:34:56 |
| boilerpy3 | 1.0.7 | Python port of Boilerpipe, for HTML boilerplate removal and text extraction | 2023-11-01 22:55:59 |
| easyocr-itgn | 1.2.3 | Modified Easyorc By IntoThatGoodNight | 2023-07-23 17:05:21 |